Sketching Meets Random Projection in the Dual: A Provable Recovery Algorithm for Big and High-dimensional Data

Authors

  • Jialei Wang
  • Jason D. Lee
  • Mehrdad Mahdavi
  • Mladen Kolar
  • Nathan Srebro
Abstract

We provide a unified optimization view of iterative Hessian sketch (IHS) and iterative dual random projection (IDRP). We establish a primal-dual connection between the Hessian sketch and dual random projection, and show that their iterative extensions are preconditioned optimization processes. Building on this insight and on conjugate gradient descent, we develop accelerated versions of IHS and IDRP, and propose a primal-dual sketch method that simultaneously reduces the sample size and the dimensionality.
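For concreteness, the following minimal Python sketch illustrates the preconditioning view of IHS on an overdetermined least-squares problem: each iteration takes the exact gradient of the original objective and preconditions it with a Hessian built from sketched data, so that as the sketched Hessian approaches A^T A the step approaches the exact Newton step. The function name, the Gaussian sketch, and the parameter choices are illustrative assumptions, not the paper's exact algorithm or notation.

```python
import numpy as np

def ihs_least_squares(A, b, sketch_dim, n_iter=10, seed=0):
    """Iterative Hessian sketch for min_x (1/2)||Ax - b||^2 (illustrative only).

    Each iteration is a gradient step on the original objective,
    preconditioned by the sketched Hessian (SA)^T (SA); sketch_dim is
    assumed to be at least the number of columns of A.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(n_iter):
        # Fresh Gaussian sketch S of size sketch_dim x n; other subspace
        # embeddings could be used in its place.
        S = rng.standard_normal((sketch_dim, n)) / np.sqrt(sketch_dim)
        SA = S @ A
        grad = A.T @ (A @ x - b)      # exact gradient of the full objective
        H_sketch = SA.T @ SA          # sketched Hessian used as preconditioner
        x = x - np.linalg.solve(H_sketch, grad)
    return x
```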


Similar articles

Recovering the Optimal Solution by Dual Random Projection

Random projection has been widely used in data classification. It maps high-dimensional data into a low-dimensional subspace in order to reduce the computational cost of solving the related optimization problem. While previous studies have focused on analyzing the classification performance obtained with random projection, in this work we consider the recovery problem, i.e., how to accurately recove...
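As a concrete instance of the recovery idea, the sketch below applies dual random projection to ridge regression: solve the reduced problem on the projected features, take its dual variables, and map them back through the original data matrix to obtain a solution in the full d-dimensional space. The function name, the Gaussian projection, and the ridge instantiation are assumptions made for illustration; the original work covers a broader class of losses.

```python
import numpy as np

def dual_random_projection_ridge(X, y, lam, proj_dim, seed=0):
    """Dual random projection recovery for ridge regression (illustrative only).

    Exact ridge solves (X X^T + lam I) alpha = y and sets w = X^T alpha.
    Here the dual variables are computed from the projected data X @ R,
    and the recovery step maps them back via the original X.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    R = rng.standard_normal((d, proj_dim)) / np.sqrt(proj_dim)
    XR = X @ R                                   # data in the projected space
    alpha = np.linalg.solve(XR @ XR.T + lam * np.eye(n), y)  # reduced dual
    return X.T @ alpha                           # recovered primal solution in R^d
```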


Improved Big Bang-Big Crunch Algorithm for Optimal Dimensional Design of Structural Walls System

Among the different lateral force resisting systems, shear walls provide appropriate stiffness and hence are extensively employed in the design of high-rise structures. The architectural concerns regarding the safety of these structures have further widened the application of coupled shear walls. The present study investigated the optimal dimensional design of coupled shear walls based on the im...


A Deterministic Analysis of Noisy Sparse Subspace Clustering for Dimensionality-reduced Data

Subspace clustering groups data into several low-rank subspaces. In this paper, we propose a theoretical framework to analyze a popular optimization-based algorithm, Sparse Subspace Clustering (SSC), when the data dimension is compressed via some random projection algorithms. We show SSC provably succeeds if the random projection is a subspace embedding, which includes random Gaussian projection...
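A minimal illustration of the analyzed pipeline, assuming a Gaussian random projection and a lasso-based self-representation, might look as follows; the helper name and parameter choices are hypothetical rather than taken from the paper.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.linear_model import Lasso

def ssc_on_projected_data(X, n_clusters, proj_dim, lam=0.01, seed=0):
    """Sparse subspace clustering on randomly projected data (illustrative only).

    Columns of X are data points. Each projected point is expressed as a
    sparse combination of the others; the coefficients define an affinity
    graph that is partitioned by spectral clustering.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    P = rng.standard_normal((proj_dim, d)) / np.sqrt(proj_dim)
    Z = P @ X                                    # compressed data, proj_dim x n
    C = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=5000)
        lasso.fit(Z[:, others], Z[:, i])
        C[others, i] = lasso.coef_
    affinity = np.abs(C) + np.abs(C).T           # symmetrized affinity graph
    return SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                              random_state=seed).fit_predict(affinity)
```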


Towards Making High Dimensional Distance Metric Learning Practical

In this work, we study distance metric learning (DML) for high-dimensional data. A typical approach to DML with high-dimensional data is to perform dimensionality reduction before learning the distance metric. The main shortcoming of this approach is that it may result in a suboptimal solution due to the subspace removed by the dimensionality reduction method. In this work, we presen...


Sparse Learning for Large-Scale and High-Dimensional Data: A Randomized Convex-Concave Optimization Approach

In this paper, we develop a randomized algorithm and theory for learning a sparse model from large-scale and high-dimensional data, which is usually formulated as an empirical risk minimization problem with a sparsity-inducing regularizer. Under the assumption that there exists an (approximately) sparse solution with high classification accuracy, we argue that the dual solution is also sparse or...



Publication date: 2017